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Data series subtraction with unknown and unmodeled background noise 
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LISA Pathfinder (LPF), the precursor mission to a gravitational wave observatory of the European 
Space Agency, will measure the degree to which two test masses can be put into free fall, aiming to 
demonstrate a suppression of disturbance forces corresponding to a residual relative acceleration with a 
power spectral density (PSD) below (30 fm/s 2 /vHz) 2 around 1 mHz. In LPF data analysis, the 
disturbance forces are obtained as the difference between the acceleration data and a linear combination 
of other measured data series. In many circumstances, the coefficients for this linear combination are 
obtained by fitting these data series to the acceleration, and the disturbance forces appear then as the 
data series of the residuals of the fit. Thus the background noise or, more precisely, its PSD, whose 
knowledge is needed to build up the likelihood function in ordinary maximum likelihood fitting, is here 
unknown, and its estimate constitutes instead one of the goals of the fit. In this paper we present a 
fitting method that does not require the knowledge of the PSD of the background noise. The method is 
based on the analytical marginalization of the posterior parameter probability density with respect to the 
background noise PSD, and returns an estimate both for the fitting parameters and for the PSD. We 
show that both these estimates are unbiased, and that, when using averaged Welch’s periodograms for 
the residuals, the estimate of the PSD is consistent, as its error tends to zero with the inverse square root 
of the number of averaged periodograms. Additionally, we find that the method is equivalent to some 
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implementations of iteratively reweighted least-squares fitting. We have tested the method both on 
simulated data of known PSD and on data from several experiments performed with the LISA 
Pathfinder end-to-end mission simulator. 
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I. INTRODUCTION 

LISA Pathfinder (LPF) [1] is the precursor mission to 
a gravitational wave (GW) observatory of the European 
Space Agency (ESA). Its primary goal is that of 
assessing if a set of reference test masses (TMs) can 
be put into free motion, with residual accelerations, 
relative to the local inertial frame, having a power 
spectral density (PSD) less than (30 fm/s 2 /\/Hz) 2 , at 
frequencies between 1 and 30 mHz. This goal is 
pursued by measuring the relative acceleration of two 
TMs, separated by a nominal distance of 38 cm, along 
the line — whose direction we call x — joining their 
centres of mass (Fig. 1). The relative motion between 
the TMs, X\ 2 , is measured by means of a laser 
interferometer, the output of which s l2 = x n + n l2 is 
affected by a readout noise « 12 with less than 
(6 pm/\/Hz) 2 PSD at mHz frequencies. 

The relative acceleration a is then calculated by numeri- 
cally performing the second time derivative [2] of the 
interferometer output s l2 , 



FIG. 1 (color online). Schematic of LPF. The figure shows 
the reference TM, TM2, and the two laser interferometers — 
represented by their respective laser beam paths — that measure X\ 
and A'i 2 , respectively. The x axis, shown in the figure, is parallel to 
the line joining the centres of mass of the two TMs. The z axis, 
normal to the figure, points toward the Sun. Also shown are 
the electrodes used to apply the forces to TM2, necessary to keep 
it at nominally fixed distance from the reference TM. Similarly, 
the picture shows a pair of fiN thrusters that are used to force 
the satellite to stay at a nominally fixed position relative to the 
reference TM. Not shown in the figure are the electrodes and the 
piN thrusters used to control TM and satellite along degrees of 
freedom other than x. 


a = 


d 2 s \ 2 

~df~ 


d 2 x 12 
dt 2 


d 2 n \ 2 d 2 x 12 

~dF=^[F +ar ’ 


(i) 


where we have implicitly defined the readout acceleration 
noise a r . 

The TMs are not both free-falling along x. One TM, the 
inertial reference, is indeed following a pure geodesic orbit, 
but both the satellite, and the other TM, that we call TM2, 
are forced, by some force control loop, to stay nominally at 
fixed positions relative to the reference TM. The satellite is 
actuated by a set of /.iN thrusters within a feedback loop 
driven by the signal from a dedicated interferometer, which 
measures the relative displacement x x between the satellite 
and the reference TM. The second TM is instead subject to 
a weak electrostatic force commanded by a feedback loop 
driven by the main interferometer signal s 12 . 

The relative motion of the satellite and the TMs, along 
degrees of freedom other than x, is also measured, either by 
laser interferometers or by capacitive sensing, and con- 
trolled by a combination of electrostatic forces and torques 
on the TMs, and of //A-thruster-generated forces and 
torques acting on the satellite. 

In standard operations, control loops keep the relative 
motion small enough that the system is expected to behave 
linearly, obeying a set of linear dynamical equations [3]. 
For instance, the equation for a is 


a = 


E R t* 


d 2 S: 


dt 2 



+ 9 + a r . 

j 


( 2 ) 


The symbol * indicates time convolution. Rj in Eq. (2) is a 
linear operator that represents the unwanted pickup, by 
the differential interferometer, of generalized coordinates 
other than x 12 , like for instance X \ . These coordinates are 
measured by the signals Sj, just as s i2 measures the 
coordinate x 12 . Ideally Rj should be zero, but imperfections 
and misalignments make it nonzero. 

In principle R= acts on coordinates, not on signals. 
Substituting coordinates with signals, produces an extra 
term, as signals are always affected by some readout 
noise. We absorb this term into the overall readout 
noise a r . 

The readout noise, because of the second time derivation, 
raises, in power, with the frequency / as ~/ 4 , and is 
expected to dominate the data above some 30 mHz, thus 
setting LPF’s measurement bandwidth. 
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The generalized differential forces per unit mass, appear- 
ing on the right-hand side of Eq. (2) are split into three 
contributions: 

(i) The linear operator roj converts the relative motion 
of the TMs and the satellite, along any of the degrees 
of freedom, into a differential force along x. For 
really free-falling TMs, coj should be zero. However 
static force gradients within the satellite makes the 
diagonal coefficients nonzero, while various kind 
of imperfections and misalignments contribute to 
nondiagonal terms. 

(ii) The forces commanded by the control loops g c j, 
that are converted into true forces by the linear 
“calibration” operator Aj. Aj should be just 1 when j 
corresponds to the electrostatic force commanded 
along x on TM2, and zero for any other value of j, 
but deviates from that because of imperfect calibra- 
tion, delays and signal cross-talk. 

(iii) The random forces g stemming from all remaining 
disturbances, and whose measurement is the primary 
target of LPF. 

Measuring R ’ • and A, is one of the tasks of our analysis. 
Furthermore, as the coupling of the TMs to the satellite is 
expected to be present also in a GW observatory like eLISA 
[4], one of the goals of LPF is to give a measurement of oP 
to be compared with the prediction of the physical model of 
the system. 

The most important goal for LPF though, is that of 
measuring the PSD of g, the parasitic forces that act on TMs 
and push them away from their geodesic trajectories. 

Equation (2) suggests a natural way of achieving both 
these goals. Indeed both the sf s and the g c j ’s are known, as 
the first have been measured, and the second have been 
commanded by the control loops. Thus a fit of the s/s and 
the g c j’ s to a, returns Rj, aP, and Ay, as best fit parameters, 
but also allows the estimation of the PSD of g from the 
fit residuals, that is from the difference between the 
acceleration data series and the fitting model. 

In reality we need to perform such fits on the data from 
two different kinds of experiment. 

When the target is that of measuring, with comparatively 
high precision, the values of Rj, aP and Ay [see Eq. (2)], we 
perform dedicated calibration campaigns, where some 
proper guidance signals are injected into the appropriate 
control loops, so that the sf s and the g°j’s undergo large 
variations. This way Rj, op- and Aj can be measured with 
large signal-to-noise Ratio (SNR). 

When the target is instead a higher accuracy measure- 
ment of the PSD of the ultimate background acceleration 
noise, we do not apply any guidance signal, but just record 
acceleration noise data. These data are then fit to the sj’s 
and the gfj* s, with the aim of separating g from the effect of 
the other force terms in the right-hand side of Eq. (2). 
Indeed g becomes now the residual of the fit, that is, the 
difference between the acceleration data and the best fit 
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model. Actually, also other time series, like thermometer or 
magnetometer data, may be fitted to the acceleration data to 
detect and separate specific disturbance sources. 

It is worth stressing that, the sj * s and the g c y s cannot be 
turned off at any time so that an independent measurement 
of g cannot be performed. A similar situation would also 
hold for a GW detector like eLISA, where large signals are 
expected to dominate the data at all times so that an 
independent measurement of the background noise cannot 
be performed. 

To perform these fits we could not use a standard least- 
squares method, and we had to develop a different fitting 
method. Indeed, to perform a least-squares fit on data with 
colored background noise, as is certainly the case for LPF, 
one needs an a priori knowledge of the background noise 
PSD, either to set up a whitening filter, if the fit is performed 
in the time domain, or, for the more common case of a fit in 
the frequency domain, to assign the statistical weights to each 
fit residual. However, in our case, the PSD is not known 
a priori and is actually one of the targeted outputs of the fit. 

We have then developed a fitting method that works 
without an a priori knowledge of the background noise 
PSD. The method returns, besides the value of the fitting 
parameters, also an estimate for the background noise PSD. 
To achieve a comparatively high precision PSD estimation, 
the method preserves the ability of averaging over inde- 
pendent data stretches, like with the standard Welch’s 
averaged periodogram technique [5]. We use this method 
in the framework of Bayesian estimation. However, we 
show in the paper that it can also be extended to the 
standard “frequentist” fitting approach. 

Over the last few years, different authors, in the 
framework of GW detection and Bayesian parameter 
estimation, have addressed the problem of fitting without 
a complete a priori knowledge of the noise PSD [6-8]. The 
emphasis of these studies was mostly on minimising 
the bias that such a lack of knowledge may induce in 
the estimated signal parameters. This is a different target 
than the one we are discussing here, where the estimation 
of the noise is the main goal of the measurement, but the 
essence of the problem is the same. 

Two main approaches have been followed in these 
studies: 

(1) Within the first approach [6,9], the value of the noise 
PSD S k = S(f k ), at each discrete frequency 
f k =k/NT, with N the length of the data series 
and T the sampling time, is assumed to be described 
by some relatively smooth function of frequency, 
also depending on a vector of some adjustable 
parameters g, S k = S(f k ,rj). 

The likelihood of the fit residuals becomes then a 
function both of signal parameters and of rj. 

Appropriate prior probability densities — often 
some broad Gaussian or uniform densities — are then 
chosen for both the signal parameters and rj. Finally 
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the posterior likelihood for all parameters is numeri- 
cally derived by the Markov chain Monte Carlo 
(MCMC) technique. 

Once the global likelihood has been derived, the 
marginal likelihood of the signal parameters alone, 
can be derived by numerically marginalizing over 
the rj. 

(2) In the second approach [7,8] the values of the PSD at 
each discrete frequency S k , are considered as inde- 
pendent parameters of the likelihood, each one 
distributed with a prior in the form of a scaled 
inverse x 1 density, a family of distributions describ- 
ing the statistics of the reciprocal of the square of 
Gaussian variables, that depend on two characteristic 
parameters [8,10]. This way the posterior density of 
both the signal parameters and of the S k , becomes an 
analytical function of the observed residuals and of 
the prior parameters. Once the values for these prior 
parameters have been chosen for each frequency, 
and the authors discuss possible criteria for this 
choice, the likelihood can be calculated numerically 
by MCMC. 

Our approach is close to the one in point 2 above with the 
following main differences and/or extensions: 

(1) We adopt, for the S k , a family of priors that are 
uniform, either in the logarithm or in some small 
power of S k , over a wide, but finite range of values. 
These priors give a realistic representation of our 
knowledge on the residual noise of the system (see 
sect. II B). The infinite range counterpart of these 
uniform priors can be obtained from the scaled 
inverse x 2 family for particular values of the prior 
parameters. 

(2) With this assumption we are able to extend the 
method to the very important case where the time 
domain data are partitioned into (overlapping) 
stretches, so that the standard Welch’s averaged, 
and windowed, periodogram of the residuals can be 
used for the fit [5]. 

We show that, by using this approach, 

(a) The posterior likelihood can be analytically 
marginalized over the S k s so that the margin- 
alized likelihood of signal parameters, which 
takes a very simple form, can be easily 
calculated numerically by MCMC, or numeri- 
cally maximized within a standard fitting 
approach. 

(b) S k can then be estimated analytically, and this 
estimate is shown to be consistent, its error 
tending to zero as the inverse square root of 
the number of averaged periodograms. 

(c) The above estimate shows a slight bias that 
depends on the specific prior adopted, but this 
bias tends to zero linearly with the inverse of the 
number of averaged periodograms. 


The paper presents such a method and is organized as 
follows. In Sec. II we describe the method. In Sec. Ill we 
give a test of the method with synthetic data of known PSD, 
and we present a few examples of its application to the 
reduction of data from LPF end-to-end mission simulator. 
Finally, in Sec. IV we briefly discuss the results and the 
possibility of extending the method to signal extraction for 
the data of GW detectors. 

II. MAXIMUM LIKELIHOOD WITH 
UNKNOWN COLORED NOISE 

Though the method is general, for the sake of clarity we 
will continue to refer to the example of LPF. The main 
signal for LPF is the relative acceleration data series a[n]. 
We assume that the acceleration data series may be 
modeled as 

a[n\ =g m [n,d]+g[n]. (3) 

In Eq. (3) the data series g m [n , 6] consists of the samples 
of a linear combinations of measured^ signals, like for 
example wjsj or Ajtfj in Eq. (2). g m [n,Q\ may have been 
obtained by processing those signals by some algorithm 
that depends on a set of parameters 0 t that we have arranged 
into the vector 6. For instance, the force commanded on 
TM2 along x is described by Acfeif— r 2 ), where #£( t ) is the 
recorded commanded force. Thus 0 contains in this case the 
calibration amplitude A (nominally one) and the delay r 2 , 
with the effect of the latter calculated by numerical 
interpolation of the data. 

g[n] represents the residual differential force noise time 
series, the main objective of the measurement, though 
corrupted by the superposition of the time series of the 
readout noise a r [n]. As discussed before, Eq. (3) indicates 
that g[n] might be derived from g[n,0] =a[n] — g m [n,6], 
the time series of “residuals,” a series that depends para- 
metrically on 6. 

The equality in Eq. (3) is preserved by moving to the 
Fourier domain so that one can write 

g[k,0\ = d[k] - g m [k,d ], (4) 

where the tilde indicates a discrete Fourier trans- 
form (DFT). 

We define the DFT of a stretch of N data of any series 
y[n] as 

y[k] =-j=Y^y[n]w[n]e- inl ^. (5) 

In the transformation we have already included the 
multiplication of data by a properly selected spectral 
window w[n]. This is common practice in spectral estima- 
tion to avoid excess spectral leakage [11]. 
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A. Building up the likelihood function 

We now want to discuss the joint probability density 
function of the residuals g[k,0\, conditional to a specific 
choice of the values of the @s. We will assume here that all 
noise sources are Gaussian and zero mean, so that also the 
g[k, 6] are zero-mean and Gaussian. Large non-Gaussian 
noise in LPF, like glitches and spikes, can satisfactorily be 
treated as being constituted of signals and subtracted from 
the data with minimal corruption of the data, given the 
focus on the lowest frequencies. 

The residuals g[k\ are complex quantities. It is shown in 
Appendix A that, for \k\ > k 0 and \k — kf\ > k x , where k Q 
and k l are integers depending on the adopted spectral 
window, Re{g[£]}, Im{ <?[£]}, Re{g[£']}, and Im{g[£]} may 
be considered, with good accuracy, as all zero-mean, 
independent Gaussian variables. For instance, for the 
Blackman-Harris spectral window we commonly use in 
LPF data analysis, we show in Appendix A, that a safe 
assumption is k 0 = k\ = 4. The variances of Re{p[£]}, and 
of Im{ #[/:]}, are given by (see Appendix A) 

<%{§[*]} = ^{51*]} = 2 Sk ' ^ 


where S k is the frequency averaged discrete time PSD of 
g[n], at the frequency f k = k/NT defined as 


s ‘ = i/3 ( ^ 


w{4>-k^ 


d(f). 


(7) 


Here is the discrete-time Fourier transform of w[n\ 
and Sg((f>) is the discrete-time power spectral density of the 
infinite length g[n] series, from which the set of N data 
under analysis has been extracted. Notice that, if aliasing is 
avoided, then ~Sg(4>) = TS(f = cf)/2 jrT), where S(f) is the 
ordinary PSD of the continuous process g(t) of which 
g[n] = g(t = nT) constitute the series of the samples. 

Under the hypotheses above, the joint conditional 
probability density function of Re{g[fc]} and Im{p[£]} is 
given by 


Assuming that the data series have been partitioned into 
N s independent stretches, the probability density in Eq. (8) 
becomes 


w)=n 

keQ 


1 

(^F 7 


_ N I5MI 2 
e ’ s k , 


(9) 


where the bar represents an average over the N s stretches. 

In Eq. (9) we have written the probability density of the 
data as also being conditional on S. In a standard fitting 
procedure, these coefficients are assumed to be known. 
On the contrary, here we discuss the case where the 
components of S are unknown and must be estimated 
from the data. 

We now write down the posterior probability density 
of S and 0, 


p( g = P(g|ft5)xn tee m t )xP(g) 

9 f p(s \e, s) x x p 0)M 

(10) 

In Eq. (10) we have made the key assumptions that Si is 
independent of S; if i f j, and that both are independent 

of 0. This way the joint prior probability density P(0 , S) 
splits into the product of the separated prior probability 
densities P(0) and P(S k ). 

While the independence of 6 and S is rather natural, the 
physical basis for the independence of S', and St, when 
i ^ j, may need some justification. 

Our a priori knowledge of the noise PSD, in the case 
of an instrument where the signal cannot be ‘turned off’ 
and the noise independently measured, is rather limited. 
Unexpected lines could be present in the spectrum, such 
that nearby coefficients. S',- and S i±l , may differ even in 
order of magnitude. Thus even the perfect knowledge of 5,- 
would not give us any significant information on the 
probability density function of any of the other Sf s, which 
is the very definition of independent random variables. 


-» -* — r 1 |SM| 2 

s ‘ • w 

keQ^ k 

In Eq. (8), Q is the subset Q = {k 0 ,k 0 -\-ki,k 0 -\-2k\,....}, 
of the integer set 0 < k < N / 2. In the same equation, and 
in the rest of the paper, we have organized the residuals 
g[k, 0] and the S k , with k E Q, into the vectors g and S, 
respectively. 

It is standard, in spectral analysis, to partition data series 
into shorter stretches, and to average the spectral estimate 
over these stretches. Different stretches are treated as 
statistically independent, even in the presence of partial 
overlap between adjoining stretches, if these have been 
tapered at their ends with a proper spectral window. 


B. Fitting 

We now discuss the use of the likelihood function in 
Eq. (10) for the purpose of fitting. As most of our fits 
involve nonlinear functions of parameters, our preferred 
approach is that of Bayesian parameter estimation with the 
MCMC method, but we will also consider a more conven- 
tional approach wherein one searches for a maximum of the 
likelihood as a function of fitting parameters. We will show 
now that, whatever the selected approach, with a proper 
choice of the prior probability density of the components of 
S, the parameter space can be reduced to just that of 6. 

In order to do this, we assume that the prior of S k is 
uniform as a function of either some small power of S k , or 
of log(S*), between two values S ka <^S kb . This is not 
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exactly the same as using Jeffrey’s noninformative prior 
[12], though it may be approximated by this under some 
assumptions that we will make later. 

Such a choice for the prior, in particular a uniform 
density as a function of log^), again closely reflects our 
a priori physical knowledge of the noise. Indeed, as said, in 
LPF, as in a GW observatory like eLISA, signals cannot be 
turned off and the background noise cannot be independ- 
ently measured. Though we may have theoretical models 
for the sources of g, unexpected noise sources may lead to a 
PSD deviating from theoretical expectations even in order 
of magnitude. 

With this choice of the prior, P(0,S\g) can be analyti- 
cally integrated over the S space, to obtain a marginalized 
likelihood which is only a function of 0. By assuming that 
P{0) is bound to a domain on which \g[k]\ 2 <§; S kb , and 
that, for any value of 6, S ka «; |<?[&]| 2 , the integration over 
S k can be extended from zero to infinity: 


maximum of P(0,S\g). Within the Bayesian approach, 
the likelihood mapping may be performed over just 6, by 
USing P msig {6\g). 

In both cases, the results in Eqs. (12), (13), and (14) 
indicate that, for large enough N s the logarithm of the 
likelihood to be either maximized, or used in MCMC 
mapping, is 


A(0) =log(P raarg (0|5)) = -A^logtlsM]! 2 ) + C, 

keQ 

(15) 

instead of the standard least-squares fitting result with 
known S k , 


A(e) = -<£^r^+ c '- ( 16 ) 

k<=Q k 


= / P{e,s\ g )ds. 


(ii) 


By performing the integration we obtain 

“T m—N s 
I I , \n\lr H I 

P mMg(0\g) = 


iweMr 


mje Q m^]\Y~ Ns p(#)d& 


(12) 


for a uniform prior either in S k or in log (S k ), in which case 
one should put m = 0 in the formulas. 

In addition, the likelihood P(Q , 5jg), for any given value 
of 6, reaches a maximum P max (d\g) for some values S k ma]i 
of the S k s. Both the value of P max {d\g), and of S kiTnax 
can be calculated analytically by differentiation of Eq. (9). 
We get 


c — . 
u &,max 


N . 


N s — m + 


T l M • 


(13) 


and 


AnaxW) = 


e-(",+ 1 )(A^ Y + l)^ +1 -» , P marg (fl|g) 

- 1)1 


(14) 


We note that Eq. (13), in the limit of large N s , where all 
priors give the same formula, anticipates our result for 
the estimate of S k that we further discuss in detail in 
subsection II D. 

These results allow us to restrict the fitting to the 6 
parameter space. In the maximization approach, one can 
search for a maximum of P max {S\g), that will also be a 


Here, C and C' are just constants. In essence, according to 
Eq. (15), in the presence of unknown and unmodeled noise, 
any fit must minimize not the mean square residuals, but 
rather the sum of their logarithm. This is one of the main 
results of the paper. 

As already mentioned in the introduction, a likelihood 
proportional to that in Eq. (12) has been found for the 
special case N s = 1 and m = 0 by [8]. Our result general- 
izes it to the experimentally important case of averaged and 
windowed periodograms, and to the case of m < 0. 

For the rest of the paper we will call the likelihood in 
Eq. (15), the “logarithmic” likelihood (LL). 


C. Truly independent DFT coefficients 
and spectral resolution 

In order to maintain the accuracy of the result in Eq. (15), 
one should fulfil the condition \k — k*\ >k l , dropping a 
great number of DFT coefficients, and thus lowering the 
spectral resolution of the fit. This might be undesirable at 
the lowest frequencies, where the relative spectral reso- 
lution is rather low. The inaccuracy deriving from summing 
over all DFT coefficients in Eq. (15) reduces to counting 
each independent DFT coefficient more than once. Thus, 
using such a likelihood function may overestimate the 
number of degrees of freedom of the fit and may lead to an 
underestimate of the parameter errors. This effect may be 
corrected for by scaling down the likelihood by an 
appropriate^ correction factor y, that is by using the like- 
lihood A(^) = y x A(0). This is further discussed in 
Sec. IE. 

As for the coefficients with k < ko, in spectral estimation 
these are discarded in any case, because of the strong bias 
they suffer from the leakage of spectral power from 
frequencies below / = 1 / NT. 
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D. Summary of fitting procedure and the uncertainty 
on parameter estimates 

The derivation in the preceding sections leads to a very 
simple implementation of the method. The entire machi- 
nery of a standard frequency domain fit can be retained, 
provided that the usual likelihood function is replaced with 
that in Eq. (15). 

Thus, for instance, within the framework of Bayesan 
estimation, rather than sampling from the joint likelihood 
of 0 and S, parameter estimates of 0 are obtained from the 
marginalized LL likelihood, which can be numerically 
mapped over the space of the #’s by standard MCMC. 

Such a calculation does not return an estimate for the 
S k and their errors. These however can be estimated as 
follows. Let’s write the conditional mean value of any 
integer power of Sj, S'” as 

fp(d)ddtfs?PCg\d.s)pCs)ds 

Jp(d)Sj^pCg\e,s)p(s)ds ' 


<sj?|g} 


It is worth noticing that, by increasing N s , besides 
improving the precision of the noise estimate, one also 
rapidly suppresses any residual dependance of both the 
likelihood in Eq. (12) and the momenta in Eqs. (18) and 
(19), on m, the only remaining parameter in our approach 
that is still needed to characterize the noise. Indeed 
marginalization removes the dependance of the likelihood 
on the S k s but does not remove its dependence on 
parameters, like m, entering in the prior for S k , whose 
value must then be decided in advance (see for instance 
Ref. [8]). It appears then that averaging over many 
periodograms may be an effective way to remove such a 
residual bias. 

In this sense, neither the estimate for S k nor that for 0 are 
biased by noise modeling. 

Thus, in summary, the fitting proceeds in two steps: after 
having mapped the marginalized likelihood LL to estimate 
0 , S and their uncertainties can be evaluated, through 

Eq. (20), from the fit residuals evaluated for 0 = 0 O , where 
LL takes its maximum value. 


This integral may be somewhat reduced if one uses 
the prior for S discussed above. Indeed one can check 
(see Appendix B) that 

< 18 > 

and that 

09 ) 

Here the mean value (|5[/]| 4 ) is the value of \g\j, 0]| 4 
averaged over the total posterior probability density of 0. 
Thus, once this posterior probability has been calculated as 
described above, in principle also the values of Sj and of Sj , 
and then of the variance of Sj, can also be numerically 
calculated. 

The numerical calculation can be avoided if the param- 
eter posterior distribution is symmetric enough around the 
mean values, a condition we always find satisfied in our 
numerical calculations. In this case the mean values in the 
rightmost terms of the Eq. (18) and Eq. (19) will coincide 
with the values they take at the mean values 0 = 0 O where 
the likelihood is also a maximum. Then, for N s » 1, m: 


(S/15} -\g\j>0o]\ 2 

°Sj = \J {&]{§) - ( Sjfg } 2 = • ( 20 ) 

Equation (20) shows that the estimate of the 5,- is 
consistent, as its error decreases as 1 / y/N^. 


E. Relation to the iteratively reweighted 
least-squares (IRLS) method 

A popular method to perform least-square fitting 
without an a priori knowledge of the expected residual 
PSD, is that of performing the fit iteratively. The 
procedure starts by performing an ordinary least-squares 
fit, maximizing then the likelihood in Eq. (16), by using 
some arbitrary initial value for the S k , like for instance 
S k = 1. or alternatively by using the spectrum of the raw 
data before fitting. Once the maximum has been found at 

some point 0 O , the square modulus of the residuals of this 

= * y 

first fit, | g[k, 0 O ] | are used as the S k s in a new run of the 
fit. Often the residuals are smoothed over the frequency, 
or fitted to some smooth model before using them as 
weights in the next iteration of the fit. However the use of 
the residuals as they are is unbiased and numerically 
lighter, thus we prefer to restrain the discussion to just 
this case. The procedure is then iterated until the 
residuals of the fit and the S k do not change anymore 
to within some tolerance. The procedure usually con- 
verges quite rapidly. We will show here that the param- 
eter values that are obtained with this iterative procedure 
are the same as those obtained by a LL. 

Consider that, at the n th step of the procedure, one wants 
to minimize 


( 21 ) 

k & Q \~g[k,0 n _{]\ 

as function of 0. If the method converges, at some point 
the sequence becomes stationary and independent of n. 
Let us write 
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imi 2 = mA-i ]\ 2 + ( 22 ) 

and substituting in Eq. (21), we get 


& ~ ^E ( 1 + E mA '-^ A6> so) 

keQ^ i \g[k,0 n -i]\ ' 

= NgN s + E ] se h (23) 


where Nq is the number of elements in Q. If the sequence 
^ must become stationary, then from some value of n on, 
the derivatives in the right-hand side of Eq. (23) must 

become zero. Then N s ^2 ke Q\og(\g[k,$n-i]\ 2 ) reaches a 
maximum and hence so does the likelihood in Eq. (15). 

This shows that IRLS and LL achieve the same estimate 
for the best fit parameters. This will be also shown 
numerically in Sec. III. 

For nonlinear fits, when the maximum of the likelihood 
must be searched for numerically, the straightforward 
maximization of the likelihood in Eq. (15) is substantially 
quicker than the IRLS method, as the maximization with 
respect to the parameters must be performed only once. On 
the other hand, when the dependence of the likelihood on 
the parameters is linear, that is if the #’s are just amplitude 
coefficients, then the maximization, at each step, reduces to 
solving a system of linear equations and the IRLS method 
may be substantially faster. 

A theoretical estimate of parameter errors with the IRLS 
method, is not straightforward. For nonlinear fit models, 
once the stationary solution has been reached, one can use 
the residuals from this solution in place of the S k in the 
least-squares likelihood of Eq. (16) and perform an MCMC 
map, from which the parameter errors can be calculated. 
If the fit model is linear, one can apply the standard error 
propagation, or Fisher matrix technique, to the stationary 
solution and derive the errors from that. Though this is an 
heuristic approach, we will show in Sec. Ill, that, at least 
in the case of the nonlinear models that we have studied in 
the present paper, it gives results consistent with those 
obtained with the LL fits. 

Finally we want to make a couple of remarks: 

(i) The equivalence of the two methods shows that an 
IRLS fit that uses the single DFT coefficients from 
one fit to weight the next fit — as opposed to taking 
some smooth, few parameter model of the spectrum 
frequency dependence, such as a polynomial — 
implicitly assumes that the S k are uncorrelated 
and uniformly distributed in some logarithmic or 
power law space, as our LL method explicitly does. 

(ii) The assumption of uncorrelated and uniformly 
distributed S k allows for arbitrary variations of the 


PSD from one frequency to the next one and allows 
us then to weight neighboring frequencies quite 
differently based on the DFT of their residuals. 
The impact of this inhomogeneous weighting is 
limited in practice by averaging the spectrum over 
many time windows, which reduces the fluctuations 
in the mean square residuals between neighboring 
bins, and by using a small number of disturbance 
parameters 6. While we see no evidence of anoma- 
lies in the results of our analyses, we bear in mind 
that any assumption about the S k can have an impact 
on the analysis. Our technique allows for an inho- 
mogeneous weighting of points, as is consistent 
with our hypothesis of uncorrelated and uniformly 
distributed S k , and, unlike a low-order polynomial 
model of the spectrum frequency dependence, al- 
lows for lines, bumps, and other unexpected spectral 
features. 

III. APPLICATION TO DATA 

We have tested the method in two ways: 

(i) With the aim of verifying its accuracy, we have 
performed a noise decomposition exercise on a set of 
data generated from a fully known PSD. 

(ii) In order to test the practical usability of the method 
with nonideal data, we have used it in various data 
reduction exercises on the data from the LPF end-to- 
end mission simulator. 

In the next sections we describe these tests and report on 
their results, while in Sec. IV we discuss their significance. 


A. Test on data with a known PSD 

To perform this test we have generated some data series 
with spectral content and dynamic range similar to those 
of LPF data. A base data series has been created which is 
the sum of two components. The first, a[n\, simulating the 
acceleration data, is the numerical second derivative, 
obtained as described in [2], of a “red” noise series. 
This red series is composed of a 1 /f 6 PSD low frequency 
tail merging, at 10 mHz, into a flat PSD. The second series, 
cf{n\, which simulates the feedback force data, consists of a 
tail with 1 // 4 PSD merging into a flat plateau at 0.3 mHz. 
This base series is contaminated by independently super- 
posing a series eq , 0 ^ a [ n ] to a [ n ] before derivation and one 
a 2 , 0 bg[n\ to g c [n]. Sa[n] has a l/f 6 PSD, while Sg[n] has a 
band-pass structure with a broad peak around 2 mHz. Thus 
the data consist of 


g[n\ = a[n] +/[«]+ a lo + a 2 , 0 Sg[n\. (24) 

We have then prepared two data series to be used for the 
fit and the noise decomposition. 
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FIG. 2 (color online). Result of noise projection with simulated 
data. Upper noisy red line: the square root PSD of the base data 
series g[n]. Lower noisy black line: the square root PSD of the 
residuals after fitting g[n\, with a\ 4- ajSgln,^]. The solid 

blue line represents the theoretical value of the expected PSD. 

The first is the double derivative ^pr[«] of 8a[ri\. The 
second is a copy Sg[n, t\ of 8g[n\ but delayed by an 
amount r. 

A fit to g[n\ was then performed in the frequency domain 
with the fitting function 

/ ll M = «i M + « 2 8 g[n, r], (25) 

with a\, (X 2 , and r being the free parameters of the fit. 

All data series have been prepared with a sampling time 
of 10 Hz, and then, after having applied the appropriate 
delays, low-passed and decimated to 1 Hz. In analogy to the 
LPF data analysis, we further low-passed and decimated the 
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data to 0.1 Hz. We only fit the DFT coefficients with 
frequencies / < 50 mHz. 

Furthermore we have used N s =9 data stretches with 
50% overlap, and a Blackman-Harris spectral window with 
k Q = 4. Finally we have used k\ = 1. 

Fig. 2 shows a typical result of the fit. 

The fit has been performed with the MCMC method 
and the Metropolis-Hastings algorithm with the LL. The 
parameter values used for the plot in Fig. 2 correspond to 
the maximum of the likelihood for that specific realization 
of the data. An example of the marginalized MCMC 
distributions of the different fitting parameters is reported 
in Fig. 3. 

The simulation has been repeated N rep = 40 times. Each 
time we have generated a new data series to which we have 
applied the MCMC fit. For each repetition we have 
recorded the mean values and the standard deviations of 
the MCMC distributions of the three fitting parameters. 

For comparison we have also performed similar simu- 
lations for the following cases: 

(i) k { = 4 instead of k } = 1 . 

(ii) ki = 1 and y = 1/2. 

(iii) k\ = 1 and y = 1/3. 

(iv) ki = 1, but with the IRLS method. In this case, in 
each simulation, the IRLS fit has been performed 
with a numerical minimization routine until a sta- 
tionary solution has been reached. Then an MCMC 
sequence has been generated by using the likelihood 
in Eq. (16), with the S k equal to residuals of the 
stationary solution. 

The results are summarized in Table I. 

It is worth pointing out the following facts that can be 
observed in Table I: 

(i) For all fitting parameters, the average values result- 
ing from the different fitting methods, agree with 
each other and, with the true values, within their 
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FIG. 3. Marginalized histograms for the three fitting parameters, obtained with the MCMC method and the LL. 
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TABLE I. Results from simulations. The meaning of the columns is the following. True: parameter values used in the simulation. 
Columns under the header “LL” refer to fits performed with the LL method. Columns under the header TRLS’ refer to fits performed 
with the IRLS method. Av.: average of the 40 mean values of the MCMC distributions obtained in the 40 independent simulations, 
^sample • standard deviation of the 40 mean values. <7mcmc : root mean square of the standard deviations of the MCMC distributions of the 
40 independent simulations. 








LL 







IRLS 



*i 

= lr 

= 1 

k i 

= 4, y 

= 1 


= i, r = 

1/2 


= i, r = 

1/3 

*1 

= !. Y 

= 1 

Param. True 

Av. 

^sample 

0MCMC 

Av. 

^sample 

<?MCMC 

Av. 

"sample 

0MCMC 

Av. 

^sample 

0MCMC 

Av. 

^sample 

^MCMC 

«i 300 

300 

7.9 

5.8 

300 

12 

12 

302 

10 

8.4 

301 

8.9 

10 

297 

9.8 

5.5 

a 2 2 

2.005 

0.026 

0.014 

2.004 

0.029 

0.027 

2.005 

0.021 

0.019 

2.001 

0.026 

0.023 

2.000 

0.025 

0.013 

t[j] 0.8 

0.84 

0.29 

0.18 

0.70 

0.42 

0.36 

0.79 

0.34 

0.26 

0.77 

0.37 

0.31 

0.90 

0.33 

0.17 


respective errors <T sam pie/ yJN iep , with the exception 
of the results of IRLS simulation. These results 
however agree with the rest at worst to within 
1 -8 Sample/ \/^rep- In particular this confirms that 
the LL method produces an unbiased estimate of the 
parameters. 

(ii) On the contrary, for y = 1, <7 samp]e and <Tmcmc agree 
for all parameters, within their relative uncertainties, 
only for the case k\ — 4. In this last case however, 
both <7 samp ie and ^mcmc are somewhat larger than 
those obtained with k\ = 1, both for case of the LL 
and for that of IRLS. 

(iii) The agreement between <7 sam pie and <Tmcmc ma y be 
recovered also for k\ = 1, if ^ < y < 

(iv) The IRLS procedure followed by the MCMC like- 
lihood map, takes approximately 4 times the time 
needed by the straightforward LL, MCMC map. 

B. Application to LPF simulator data analysis 

As is customary with space missions, an end-to-end 
mission simulator of LPF has been set up by industry [13]. 
The simulator includes not only the linear dynamics of 
satellite and TMs translation, but also realistic, nonlinear 
models of the critical parts of the system, like the 
electrostatic actuation system, the rotational dynamics of 
both the TMs and the satellite, the interferometers, etc. 
These nonlinearities are not expected to play a significant 
role, as all displacements, velocities, etc. are expected to be 
very small during science operations. We expect then to be 
able to understand the largest part of LPF results within a 
linear model, and to have to deal with nonlinearities only 
as occasional small deviations from the linear regime. 
The simulator is proving extremely useful for testing 
this approach and the related data analysis algorithms 
and tools [14]. 

In this section we illustrate the application of the method 
described in Sec. n, to some selected cases, taken from the 
extensive simulation campaigns that have been performed 
in preparation for the mission operations. Specifically, we 
first discuss a simulated instrument calibration, where large 
calibration guidance signals are injected in the proper 


control loops. We then give an example of true noise 
decomposition performed to hunt for the source of some 
extra noise found in the simulated data. 

1. Extraction of calibration signals 

As explained in Sec. I, TM2 is forced, by a weak 
electrostatic force control loop driven by the main inter- 
ferometer signal 5 12 , to stay nominally at a fixed distance 
from the reference TM. 

This loop compensates most of the low frequency 
(< 10 mHz) forces g by applying commanded forces g c . 
Thus, accurate subtraction of these applied forces, includ- 
ing calibration of the actuator, is needed to extract the 
disturbance forces. Our analysis subtracts the commanded 
force series multiplied by a calibration factor, Acfo. A is 
extracted with a dedicated “system identification” experi- 
ment in which a comparatively large modulated guidance 
signal is added to the measured displacement signal, s 12 , 
in the control loop. The injected signal has the effect of 
modulating the distance between the TMs by exerting 
forces much larger that those exerted by the loop in the 
presence of just the background noise. 

Our technique to calibrate the transfer function consists 
of fitting the pre-processed commanded force series 
Ag c 2 [n,6\ to a[n]. As already mentioned, by ‘pre-processed’ 
we mean here that the series are filtered via an algorithm 
that depends on some set of parameters, 0 , that become free 
fitting parameters. Commonly for the (f 2 [n , 0] series, 0 only 
includes the amplitude, A, and a delay, r, but single pole 
filters that simulate the response of the actuators have also 
been tested in the past [14-17]. In addition to the feedback 
force removal, we also subtract, from the acceleration data, 
the forces due to the motion of both TMs within the static 
force gradient in the satellite [see Eq. (2)]. This is predicted 
to be dominated by the electric gradient due to various 
voltages applied between the TMs and their surroundings, 
and by the gravitational gradient. As already explained, 
these forces are proportional to the TM displacements 
relative to the satellite, measured by s j and s n - In the 
calibration experiment these displacements are also large, 
so that the effect of the respective forces may have 
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FIG. 4 (color online). Deviation of the best fit parameter 
obtained at each step of the IRLS procedure, from those obtained 
with the MCMC mapping of the LL. Deviations are divided by 
the standard deviation obtained from the MCMC chain of the LL. 
Colors mark the different parameters. Red circles: A. Blue 
triangles: —col- Black diamonds: — (<Wj — col). Brown stars: t. 

appreciable SNR. As a consequence, we include in the 
fitting model two terms proportional to s\ and to 5 12 , 
respectively. In conclusion we fit the acceleration data 
series assuming the following model, 

a[k\ = Ag c 2 [k , r] - ais u [k] - ( \ - col)~s x [£] + g[k], (26) 

with the amplitudes A, col, (®i — 0)1), and the delay r as 
free fitting parameters. 

For all the fits, we have used the Blackman-Harris 
spectral window, taken N s — 5 and limited the frequency 
range to / < 50 mHz. 

For the sake of comparison, the fit has been performed 
with both the LL and the IRLS methods. In this last case, 
the convergence of the best-fit parameter values to those 
obtained with the LL method has been verified and the 
results are reported in Fig. 4. 

Furthermore, Fig. 5 shows the PSD of the residuals of the 
two different fits. 

Finally, in Table II we report, for the same fits, the 
parameter values and their uncertainties. 



Frequency[Hz] 

FIG. 5 (color online). PSD of various data series. The upper, 
blue, noisy continuous line is the PSD of a[n] before the fit. The 
lower, black, noisy continuous line, barely visible behind the red 
dashed line, is the PSD of the residuals of the fit with the LL. The 
red dashed line is the PSD of the residuals of the IRLS fit. 


In the case of the IRLS fit, as before, we have first run 
the reweighted iteration, performing, at each step, just a 
numerical maximization of the likelihood in Eq. (16). After 
obtaining the stationary solution for the S k , we have 
inserted these values back in the likelihood of Eq. (16) 
and performed an MCMC map to estimate the parameter 
errors reported in II. As expected, the results of the two 
methods are indistinguishable. 

It must be stressed that, unfortunately, we cannot 
straightforwardly compare the results of Table II with 
the “true” parameter values used in the simulator. Only 
the parameter A = 1 .05 has a fixed and traceable value, 
while both col and coj are heuristic single-parameter 
approximations of a complicated model where the electrical 
fields, one of the major sources of force gradient, are 
dynamically calculated as part of the overall three body 
dynamics of the TMs and the satellite. This is also the case 
for t that results from the combination of various propa- 
gation delays within the system. All that said, the values 
resulting from the fit agree with the expectation, within 
their respective uncertainties. 


TABLE II. Parameter values from the various fits of Fig. 5. “LL” refers to the fit performed with the LL. “IRLS” to 
the fit done with the IRLS method. “Mean” indicates the mean of the MCMC parameter distribution, a represents its 
standard deviation. 


Parameter 

LL 


IRLS 


Mean 

a 

Mean 

a 

1 

S 

-o to 

i 

2.259 x 10" 6 

1 x 10 -9 

2.259 x 10" 6 

1 x 10" 9 

{col 2 ] 

7.2 x 10" 7 

1.2 x 10 -7 

7.1 x 10 -7 

1.2 x 10" 7 

A 

1.04998 

1.2 x 10" 5 

1.04998 

1.2 x 10" 5 


-4.002 x 10" 1 

3 x 10 -4 

-4.002 x 10" 1 

3 x 10- 4 
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2. Decomposition of low-frequency noise 

The second case of study deals with a typical case of 
noise hunting and decomposition. We started by forming a 
time series for g[n], subtracting the commanded force and 
stiffness effects as calibrated in the previous section. The 
PSD of g[n] at frequencies below 1 mHz was found in 
excess of what was expected from the physical model at the 
basis of the simulator. However calculating the expected 
PSD is not straightforward, given the complexity of the 
simulator model, so that no firm conclusion could be made 
about this noise being a real property of the system or rather 
a software artefact. 

To help in identifying the source, we performed an 
extensive campaign of decomposition, fitting all available 
data series to g[n\ in an attempt to identify the source. Not 
knowing what to expect as a background noise, we were 
forced to develop the methods we are discussing here. Only 
three data series were observed to significantly reduce the 
noise when subtracted from g[n]. The first and most 
important series was the difference A F, between the forces 
applied by the control system to the two different TMs 
along the z axis (see Fig. 1 for the definition of the axes). 
We also found a smaller but significant contribution of 
A F y , the corresponding difference of force along y. We 
finally found another small contribution from the torque N x 
applied to the inertial reference TM along the x axis. 
Figure 6 shows the effect of subtraction. 

In the simulator model, these forces should leak into x 
via some linear coupling coefficients, the values of which 
are known simulator input parameters. We found that the 
values of the coupling coefficients resulting from the fit are 
in quantitative agreement with those in the simulator, for 
A F y and N x (see Table III). The Table also shows that, on 
the contrary, the coupling coefficient for A F. is times 



FIG. 6. The effect of subtraction of A F z , A F y , and N x . The 
dotted line represents the PSD of the data series before sub- 
traction. The solid line represents the PSD of the data series after 
subtraction, i.e. the series of the fit residuals. 


TABLE III. Parameter values from LL noise decomposition. 
Mean and a are the mean and standard deviation from the MCMC 
chain. “Expected” refer to the values in the simulator. 


Disturbance 

Coupling coefficients 

Mean 

a 

Expected 

tiF z 

-8.03 x 10" 3 

6 x 10" 5 

1.1 x 10" 3 

A F y 

1.3 x lO" 3 

2 x lO" 4 

1.1 x 10" 3 

N x 

7.9 x lO - 5 m" 1 

9 x 10 “ 6 m“ 1 

7.7 x 10- 5 m-> 


larger than the corresponding one used in the simulator 
and, in addition also has the wrong sign. 

The named forces and torques are commanded by a 
controller in charge of stabilizing the absolute orientation 
of the satellite. Explaining how this controller works goes 
beyond the scope of the present paper. For the sake of the 
discussion it is only useful to mention that the controller 
is driven by the signals from a set of Autonomous Star 
Trackers, so that our finding allowed us to trace the 
problem back to an erroneous coupling of these devices. 

IV. DISCUSSION AND CONCLUSIONS 

The results of the tests reported in Secs. Ill A and ID B 1 
show that 

(i) The PSD of the residuals of the fit is in quantitative 
agreement with the expected spectrum for the back- 
ground noise. 

(ii) The LL is well behaved and produces unbiased 
results when used in a MCMC fit. 

(iii) The LL MCMC fit, and the IRLS fit followed by a 
MCMC likelihood mapping, give the same results, 
this last one being substantially slower in the case of 
nonlinear fitting. 

(iv) The estimate of the parameter errors, from the fit 
performed with k\ = 1 , seems indeed to be moder- 
ately affected by the correlation between nearby 
DFT coefficients. This bias, as expected, is common 
both to LL and IRLS fitting. 

(v) The bias can indeed be made negligible by taking 
properly spaced coefficients, or by correcting the 
likelihood with the proper factor y. 

We believe then that the LL fitting presented here is of 
general use and solves the problem of the lack of a priori 
knowledge of the target noise in frequency domain fits. 
From the point of view of the computational load, for 
multiparameter nonlinear fitting, the method is faster than 
IRLS. However the method is intrinsically nonlinear, and 
fitting would not reduce to a set of algebraic equations, as 
IRLS does, when the fitting parameters are just multipli- 
cative amplitudes. Given that the final results of the two 
method coincide, the IRLS method is preferable in the 
linear model case. 

We think that the LL approach is numerically lighter than 
the methods [6,18] that employ a parametric model for the 
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noise in the likelihood in Eq. (8), and then sample the 
likelihood with MCMC over a parameter space that also 
includes the noise model parameters. Indeed, with that 
method the likelihood function contains more terms, the 
sum of the squares and the sum of the logarithm of the S k , 
and the parameter space to be searched numerically 
is wider. 

We think that the method used here, including the 
partitioning of the data in stretches and the average over 
the stretches, could be usefully extended also to case of 
signal extraction from GW detectors. This is particularly 
true in the case of a space borne detector like eLISA, which 
is expected to be signal dominated at all times, so that a 
direct measurement of the instrument noise is difficult. An 
unbiased estimator of the noise could then help in avoiding 
the introduction of unwanted bias in the signal parameter 
estimation. 
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APPENDIX A: PROPERTIES OF 
DFT COEFFICIENTS 


The consequence of Eqs. (A2) and (A3) is that if k is 
large enough so that w((f) — k^j) has no overlap with 
w(# + then 

(Re{£[fc]} 2 ) = (Im{p[/:]} 2 ) = ^(\g[k]\ 2 ) = ^S k . (A4) 
In addition one can calculate that 

<Re{s[&] }Im{g[&] } ) 

= (#-*£) }■ 

(A5) 

So that again, if k is large enough, then 

(*te{§[*]}M§[*]}> = °- ( A6 ) 

Finally it is also straightforward to get that 

(5MS1*']) d<PSg{<P) 

xw \^- k 2 4 ) w ^- k, 1 4 )' (A7) 

(g[%W]) =^J #%( 0 ) 



It is a well-known result [11] that 

(l§M I 2 ) = S k- (Al) 

With the same kind of calculations that lead to Eq. (Al), 
one can easily calculate that 


Thus, if k and k! are spaced enough that w((/) — k^) 
has no overlap with w((f) — kf^), then g[k] and g[k!] are 
independent random variables. 

The amount of spacing needed for all the conditions 
above to hold depends on the selected spectral window 
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FIG. 7. The spectral window for the Blackman-Harris plotted 
for different values of k, spaced by k Q = 8. 
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Ak 


FIG. 8. The absolute value of the correlation coefficient defined 
in Eq. (A9), as a function of the DFT coefficient spacing Ak. 
Values are calculated for the Blackman-Harris window in the case 
of white noise. 


w[n\. In general there exists a common value k 0 for which, 
with good accuracy, Eq. (A4) holds if \k\ > y, and g[k\ are 
independent random variables if \k — k'\ >k Q . With the 
Blackman-Harris window used in the numerical calcula- 
tions of the present paper, k 0 = 8 (see Fig. 7). 

The correlation coefficients resulting from Eq. (A7), 
may be calculated explicitly, for nearby DFT coefficients, 
assuming the noise is white. In Fig. 8 we report the values 
for 


\p(Ak)\ = 


l(Re{g[k]}Re{g[fc + Ak]})| 
V<|Re{^[k]}|^)(|Re{#+Ak]}R 
l(Im{p[k]}Im{p[k + Ak]})| 
\/(|Im{#]}| 2 3 4 5 )(|Im{p[k+ Ak]}| 2 ) 


(A9) 
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For the first few values of Ak. In addition one can 
calculate that 


K Re {g[£]}Im{g[i: + AA']})| 

v/(l Im {9W}|')(l Re {#+ A*]}|-) 


(A10) 


Thus if noise does not vary too much over a range of order 
of k 0 , a reduced spacing among coefficients may also be 
considered. In the case of the Blackman-Harris window, 
Fig. 8 suggests that coefficients could be taken every 
Ak ~ 4 and still be treated as basically independent, and 
that even for Ak ~ 2-3 the effect of the correlation may still 
be negligible. 


APPENDIX B: ADDITIONAL FORMULAS 

To get the result in Eqs. (18), and (19), it is sufficient to 
substitute P(g\0, S) taken from Eq. (9) into Eq. (17) and 
perform the integral over Sj. We get 


(Sj\g) 

n s I iffL/. q\ r n te0 m. o\ \~ 2 )~ [N, ~ m) p{e)dd 

Ns ~ m ~ l /IWIff^ 0\\~ 2 f {N '~ m) P(O)dd 
= (Bl) 


(S?ll> = 


jv? 

(N s — m — 1 )(N S — m — 2) 




2 \-( N s-' n ] 


P{0)d6 


m^ Q mA\~ 2 )~ {Ns ~ m) pCo)de 


N 2 s 


(N s — m — 1 )(N S — m — 2) 


(B2) 
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